Stealth Attacks on the Smart Grid


Ke Sun, Iñaki Esnaola, Samir M. Perlaza, and H. Vincent Poor


Abstract 


Random attacks that jointly minimize the amount of information acquired by the operator about 
the state of the grid and the probability of attack detection are presented. The attacks minimize the 
information acquired by the operator by minimizing the mutual information between the observations 
and the state variables describing the grid. Simultaneously, the attacker aims to minimize the probability 
of attack detection by minimizing the Kullback-Leibler (KL) divergence between the distribution when 
the attack is present and the distribution under normal operation. The resulting cost function is the 
weighted sum of the mutual information and the KL divergence mentioned above. The trade-off between 
the probability of attack detection and the reduction of mutual information is governed by the weighting 
parameter on the KL divergence term in the cost function. The probability of attack detection is evaluated 
as a function of the weighting parameter. A sufficient condition on the weighting parameter is given 
for achieving an arbitrarily small probability of attack detection. The attack performance is numerically
assessed on the IEEE 30-Bus and 118-Bus test systems.


Index Terms 


Stealth, data injection attacks, information-theoretic security, mutual information, probability of
detection


I. INTRODUCTION 


The smart grid relies on the effective integration of the power grid and advanced communication
and sensing infrastructure. Consistency between the physical layer of the power
grid and the energy management system (EMS) in the cyber layer facilitates an economic
and reliable operation of the power system. The 2003 North American outage caused by an
alarm system failure [1] and the 2015 Ukraine power failure caused by the BlackEnergy virus
incident [2] emphasize the need for cybersecurity mechanisms for the power system. However,
the cybersecurity threats to which the smart grid is exposed are not well understood yet, and
therefore, practical security solutions need to come forth as a multidisciplinary effort combining
technologies such as cryptography, machine learning, and information-theoretic security [3].

Data injection attacks (DIAs) have emerged as a major source of concern and exemplify the 
type of cybersecurity threats that specifically target power systems [4]. DIAs manipulate the 
state estimation process in the EMS by altering the measurements of the state variables without 
triggering the bad data detection mechanism put in place by the operator. In [4] it is shown that 
attacks that lie in the column space of the Jacobian measurement matrix are undetectable by 
testing the residual. To decrease the number of sensors that need to be compromised by the
attacker while remaining undetectable, the $\ell_0$ norm of the attack vector is used as the minimization
objective, yielding sparse attacks in [5], [6], [7], and [8]. The case in which sparse attacks are
constructed in a distributed setting with multiple attackers is discussed in [9] and [10].

The complex nature of the power system leads naturally to a stochastic modelling of the 
state variables describing the grid. For instance, the state variables of low voltage distribution 
systems are well described as following a multivariate Gaussian distribution [11]. DIAs within 
a Bayesian framework with minimum mean square error (MMSE) estimation are studied in [12] 
for the centralized case and in [13] for the distributed case. However, the fundamental limits 
governing the performance of attacks in the smart grid are not well understood yet. 

Information-theoretic tools are well suited to analyze power systems by leveraging the stochastic
description of the state variables. A sensor placement strategy that accounts for the amount of
information acquired by the sensing infrastructure is studied in [14]. Information-theoretic privacy
guarantees for smart meter users are proposed in [15], [16], [17] for memoryless stochastic
processes and in [18] for general random processes. In [19], stealth Gaussian DIA constructions
are studied in terms of information measures that quantify the information loss and the probability
of attack detection induced by the attack. Therein, the proposed cost function gives the same
weight to the information loss and the probability of detection, which results in the effective
secrecy framework proposed by [20] in the context of stealth communications. Stealth DIA
constructions are also studied in [5], [21] for the case in which the detection is based on the
residual and in a Bayesian hypothesis testing framework in [22]. The approaches in [5] and
[21] consider the minimum cost of compromising the meters and the communication substation,
respectively. On the other hand, [22] focuses on the delay between the time at which the attacker
launches the attack and the time at which the operator detects it.

In this paper, the stealth attacks in [19] are generalized by introducing a weight parameter 
to the objective describing the probability of detection, which allows the attacker to construct 
attacks with arbitrarily low probability of detection. Operating under the assumption that the state 
variables are described by a multivariate Gaussian distribution [12], [13], we characterize the 
optimal Gaussian generalized stealth attacks. Since the performance of the attacks depends on 
the weighting parameter governing the probability of detection, we provide a sufficient condition 
on the weighting parameter that achieves a desired probability of attack detection. To this 
end, we characterize the probability of attack detection via an upper bound which leverages 
a concentration inequality in [23]. 

The rest of the paper is organized as follows. In Section II, a Bayesian framework
with linearized dynamics for DIAs is introduced. The generalized stealth attack construction
and its performance analysis are presented in Section III. Section IV provides the probability of
detection of the generalized stealth attack and a concentration inequality based upper bound
on the probability of detection. Section V verifies the results of Section III and Section IV on the
IEEE 30-Bus and 118-Bus test systems. The paper ends with conclusions in Section VI.


II. SYSTEM MODEL 
A. Bayesian Framework with Linearized Dynamics 


The measurement model for state estimation with linearized dynamics is given by

$$Y^M = \mathbf{H} X^N + Z^M, \tag{1}$$

where $Y^M \in \mathbb{R}^M$ is a vector of random variables describing the measurements; $X^N \in \mathbb{R}^N$ is a
vector of random variables describing the state variables; $\mathbf{H} \in \mathbb{R}^{M \times N}$ is the linearized Jacobian
measurement matrix which is determined by the power network topology and the admittances
of the branches; and $Z^M \in \mathbb{R}^M$ is the additive white Gaussian noise (AWGN) with distribution
$\mathcal{N}(\mathbf{0}, \sigma^2 \mathbf{I}_M)$ that is introduced by the sensors as a result of the thermal noise [24], [25]. In the
remainder of the paper, we assume that the vector of the state variables follows a multivariate
Gaussian distribution given by

$$X^N \sim \mathcal{N}(\mathbf{0}, \Sigma_{XX}), \tag{2}$$

where $\Sigma_{XX} \in S_+^N$ is the covariance matrix of the distribution of the state variables and $S_+^N$
denotes the set of positive semidefinite matrices of size $N \times N$. As a result of the linearized
dynamics in (1), the vector of measurements also follows a multivariate Gaussian distribution
denoted by

$$Y^M \sim \mathcal{N}(\mathbf{0}, \Sigma_{YY}), \tag{3}$$

where $\Sigma_{YY} = \mathbf{H} \Sigma_{XX} \mathbf{H}^T + \sigma^2 \mathbf{I}_M$ is the covariance matrix of the distribution of the vector of
measurements.
Data injection attacks corrupt the measurements available to the operator by adding an attack
vector to the measurements. The resulting vector of compromised measurements is given by

$$Y_A^M = \mathbf{H} X^N + Z^M + A^M, \tag{4}$$

where $A^M \in \mathbb{R}^M$ is the attack vector and $Y_A^M \in \mathbb{R}^M$ is the vector containing the compromised
measurements [4]. Given the stochastic nature of the state variables, it is reasonable for the
attacker to pursue a stochastic attack construction strategy. In the following, an attack vector
which is independent of the state variables is constructed under the assumption that the attack
vector follows a multivariate Gaussian distribution denoted by

$$A^M \sim \mathcal{N}(\mathbf{0}, \Sigma_{AA}), \tag{5}$$

where $\Sigma_{AA} \in S_+^M$ is the covariance matrix of the attack distribution. The rationale for choosing
a Gaussian distribution for the attack vector follows from the fact that for the measurement
model in (4) the additive attack distribution that minimizes the mutual information between the
vector of state variables and the compromised measurements is Gaussian [26]. Because of the
Gaussianity of the attack distribution, the vector of compromised measurements is distributed as

$$Y_A^M \sim \mathcal{N}(\mathbf{0}, \Sigma_{Y_A Y_A}), \tag{6}$$

where $\Sigma_{Y_A Y_A} = \mathbf{H} \Sigma_{XX} \mathbf{H}^T + \sigma^2 \mathbf{I}_M + \Sigma_{AA}$ is the covariance matrix of the distribution of the
compromised measurements.


It is worth noting that the independence of the attack vector with respect to the state variables 
implies that constructing the attack vector does not require access to the realizations of the 
state variables. In fact, knowledge of the second order moments of the state variables and the 
variance of the AWGN introduced by the measurement process suffices to construct the attack. 
This assumption significantly reduces the difficulty of the attack construction. 
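
As a minimal sketch of the model in (1)-(6), the following Python snippet builds the covariance matrices with and without attack and compares the analytical covariance of the compromised measurements with a sample estimate. The function name `measurement_covariances`, the matrix sizes, the random matrix `H`, and the attack scaling are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def measurement_covariances(H, sigma_xx, sigma_aa, noise_var):
    """Return (Sigma_YY, Sigma_YAYA) for the linear Gaussian model.

    Sigma_YY   = H Sigma_XX H^T + noise_var * I   (no attack, eq. (3))
    Sigma_YAYA = Sigma_YY + Sigma_AA              (under attack, eq. (6))
    """
    m = H.shape[0]
    sigma_yy = H @ sigma_xx @ H.T + noise_var * np.eye(m)
    return sigma_yy, sigma_yy + sigma_aa

# Toy setting (hypothetical sizes and values, for illustration only).
rng = np.random.default_rng(0)
M, N = 6, 4
H = rng.standard_normal((M, N))
sigma_xx = np.eye(N)
noise_var = 0.1
sigma_aa = 0.5 * H @ sigma_xx @ H.T

sigma_yy, sigma_yaya = measurement_covariances(H, sigma_xx, sigma_aa, noise_var)

# Draw Y_A = H X + Z + A. The attack is generated from an independent
# Gaussian vector, so it never uses the realizations of the state variables.
n = 100_000
X = rng.multivariate_normal(np.zeros(N), sigma_xx, size=n)
X_tilde = rng.multivariate_normal(np.zeros(N), sigma_xx, size=n)
Z = np.sqrt(noise_var) * rng.standard_normal((n, M))
A = np.sqrt(0.5) * X_tilde @ H.T          # covariance 0.5 * H Sigma_XX H^T
Y_A = X @ H.T + Z + A

sample_cov = Y_A.T @ Y_A / n              # zero-mean sample covariance
max_abs_err = np.max(np.abs(sample_cov - sigma_yaya))
```

Note that the attack samples only require `H`, `sigma_xx`, and a scaling factor, which mirrors the observation that second order statistics suffice to construct the attack.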

The operator of the power system makes use of the acquired measurements to detect the
attack. The detection problem is cast as a hypothesis testing problem with hypotheses

$$\mathcal{H}_0: Y^M \sim \mathcal{N}(\mathbf{0}, \Sigma_{YY}), \quad \text{versus} \tag{7}$$
$$\mathcal{H}_1: Y^M \sim \mathcal{N}(\mathbf{0}, \Sigma_{Y_A Y_A}). \tag{8}$$

The null hypothesis $\mathcal{H}_0$ describes the case in which the power system is not compromised, while
the alternative hypothesis $\mathcal{H}_1$ describes the case in which the power system is under attack.
Two types of error are considered in hypothesis testing problems: the Type I error is the probability
of a "false alarm" event, i.e., deciding $\mathcal{H}_1$ when the system is not compromised; and the Type II
error is the probability of a "missed detection" event, i.e., deciding $\mathcal{H}_0$ when the system is under attack. The
Neyman-Pearson lemma [27] states that for a fixed probability of Type I error, the likelihood
ratio test (LRT) achieves the minimum Type II error when compared with any other test with
an equal or smaller Type I error. Consequently, the LRT is chosen to decide between $\mathcal{H}_0$ and
$\mathcal{H}_1$ based on the available measurements. The LRT between $\mathcal{H}_0$ and $\mathcal{H}_1$ takes the following form:

$$L(\mathbf{y}) \triangleq \frac{f_{Y_A^M}(\mathbf{y})}{f_{Y^M}(\mathbf{y})} \underset{\mathcal{H}_0}{\overset{\mathcal{H}_1}{\gtrless}} \tau, \tag{9}$$

where $\mathbf{y} \in \mathbb{R}^M$ is a realization of the vector of random variables modelling the measurements,
$f_{Y_A^M}$ and $f_{Y^M}$ denote the probability density functions (p.d.f.'s) of $Y_A^M$ and $Y^M$, respectively,
and $\tau$ is the decision threshold set by the operator to meet the false alarm constraint.
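
For zero-mean Gaussian hypotheses the test in (9) reduces to comparing a quadratic form with a threshold. The sketch below evaluates the log-likelihood ratio directly from the two covariance matrices; the function names and the toy covariances are hypothetical and only serve to illustrate the decision rule.

```python
import numpy as np

def log_likelihood_ratio(y, sigma_yy, sigma_yaya):
    """log L(y) = log f_{Y_A}(y) - log f_{Y}(y) for zero-mean Gaussians."""
    _, logdet_yy = np.linalg.slogdet(sigma_yy)
    _, logdet_ya = np.linalg.slogdet(sigma_yaya)
    quad_yy = y @ np.linalg.solve(sigma_yy, y)      # y^T Sigma_YY^{-1} y
    quad_ya = y @ np.linalg.solve(sigma_yaya, y)    # y^T Sigma_YAYA^{-1} y
    return 0.5 * (logdet_yy - logdet_ya) + 0.5 * (quad_yy - quad_ya)

def lrt_decides_attack(y, sigma_yy, sigma_yaya, tau):
    """Return True when the LRT in (9) decides H1, i.e. L(y) >= tau."""
    return log_likelihood_ratio(y, sigma_yy, sigma_yaya) >= np.log(tau)

# Toy covariances (assumed values): the attack doubles the variance.
sigma_yy = np.eye(2)
sigma_yaya = 2.0 * np.eye(2)
small = lrt_decides_attack(np.zeros(2), sigma_yy, sigma_yaya, tau=2.0)
large = lrt_decides_attack(np.array([3.0, 3.0]), sigma_yy, sigma_yaya, tau=2.0)
```

Working with the logarithm avoids evaluating the densities themselves, which can underflow when $M$ is large.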


B. Information-Theoretic Setting 


The mutual information between two random variables is a measure of the amount of
information that each random variable contains about the other random variable. Consequently,
the amount of information that the vector of measurements contains about the vector of state
variables is determined by the mutual information between the vector of state variables and
the vector of measurements. The Kullback-Leibler (KL) divergence between two probability
distributions is a measure of the statistical dissimilarity between the distributions. For the hypothesis
testing problem in (9), a small value of the KL divergence between $P_{Y_A^M}$ and $P_{Y^M}$ implies that
on average the attack is unlikely to be detected by the LRT set by the operator for a fixed value
of $\tau$.

The purpose of the attacker is to disrupt the normal state estimation procedure by minimizing
the information that the operator acquires about the state variables, while guaranteeing that the
probability of attack detection is small enough for the attack to remain concealed in the system.

An information-theoretic framework for the attack construction is adopted in this paper. To
minimize the information that the operator acquires about the state variables from the
measurements, the attacker minimizes the mutual information between the vector of state variables and the
vector of compromised measurements. Specifically, the attacker aims to minimize $I(X^N; Y_A^M)$.
On the other hand, the probability of attack detection is determined by the detection threshold
$\tau$ set by the operator and the distribution induced by the attack on the vector of compromised
measurements. An analytical expression of the probability of attack detection can be described in
closed form as a function of the distributions describing the measurements under both hypotheses.
However, the expression is involved in general and it is not straightforward to incorporate it into
an analytical formulation of the attack construction. For that reason, we instead consider the
asymptotic performance of the LRT to evaluate the detection performance of the operator. The
Chernoff-Stein lemma [28] characterizes the asymptotic exponent of the Type II error
when the number of observations of measurement vectors grows to infinity. In our setting, the
Chernoff-Stein lemma states that for any LRT and $\epsilon \in (0, 1/2)$, it holds that

$$\lim_{n \to \infty} \frac{1}{n} \log \beta_n^{\epsilon} = -D(P_{Y_A^M} \| P_{Y^M}), \tag{10}$$

where $D(\cdot \| \cdot)$ is the KL divergence, $\beta_n^{\epsilon}$ is the minimum Type II error such that the Type I error $\alpha$
satisfies $\alpha < \epsilon$, and $n$ is the number of $M$-dimensional measurement vectors that are available for
the LRT. Therefore, for the attacker, minimizing the asymptotic detection probability is equivalent
to minimizing $D(P_{Y_A^M} \| P_{Y^M})$, where $P_{Y_A^M}$ and $P_{Y^M}$ denote the probability distributions of $Y_A^M$
and $Y^M$, respectively.


III. INFORMATION-THEORETIC ATTACK
A. Generalized Stealth Attacks


When these two information-theoretic objectives are considered by the attacker, [19] proposes
a stealthy attack construction that combines the two objectives in one cost function, i.e.,

$$I(X^N; Y_A^M) + D(P_{Y_A^M} \| P_{Y^M}) = D(P_{X^N Y_A^M} \| P_{X^N} P_{Y^M}), \tag{11}$$

where $P_{X^N Y_A^M}$ is the joint distribution of $X^N$ and $Y_A^M$. The resulting optimization problem to
construct the attack is given by

$$\min_{A^M} D(P_{X^N Y_A^M} \| P_{X^N} P_{Y^M}). \tag{12}$$

Therein, it is shown that (12) is a convex optimization problem and the covariance matrix of
the optimal Gaussian attack is $\Sigma_{AA} = \mathbf{H} \Sigma_{XX} \mathbf{H}^T$. However, numerical simulations on IEEE test
systems show that the attack construction proposed above yields large values of the probability of
detection in practical settings.

To address the issue of high probability of detection, in the following we propose an attack
construction strategy that tunes the probability of detection with a parameter that weights the
detection term in the cost function. The resulting optimization problem is given by

$$\min_{A^M} I(X^N; Y_A^M) + \lambda D(P_{Y_A^M} \| P_{Y^M}), \tag{13}$$

where $\lambda \geq 1$ governs the weight given to each objective in the cost function. It is interesting
to note that for the case in which $\lambda = 1$ the proposed cost function boils down to the effective
secrecy proposed in [20] and the attack construction in (13) coincides with that in [19]. For
$\lambda > 1$, the attacker adopts a conservative approach and prioritizes remaining undetected over
minimizing the amount of information acquired by the operator. By increasing the value of $\lambda$
the attacker decreases the probability of detection at the expense of increasing the amount of
information acquired by the operator via the measurements.


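B. Optimal Attack Construction

The attack construction in (13) is formulated in a general setting. The following propositions
particularize the KL divergence and the mutual information to our multivariate Gaussian setting.


Proposition 1. [28] The KL divergence between the M-dimensional multivariate Gaussian
distributions $\mathcal{N}(\mathbf{0}, \Sigma_{Y_A Y_A})$ and $\mathcal{N}(\mathbf{0}, \Sigma_{YY})$ is given by

$$D(P_{Y_A^M} \| P_{Y^M}) = \frac{1}{2} \left( \log \frac{|\Sigma_{YY}|}{|\Sigma_{Y_A Y_A}|} - M + \mathrm{tr}\left(\Sigma_{YY}^{-1} \Sigma_{Y_A Y_A}\right) \right). \tag{14}$$


Proposition 2. [28] The mutual information between the vectors of random variables $X^N \sim
\mathcal{N}(\mathbf{0}, \Sigma_{XX})$ and $Y_A^M \sim \mathcal{N}(\mathbf{0}, \Sigma_{Y_A Y_A})$ is given by

$$I(X^N; Y_A^M) = \frac{1}{2} \log \frac{|\Sigma_{XX}| |\Sigma_{Y_A Y_A}|}{|\Sigma|}, \tag{15}$$

where $\Sigma$ is the covariance matrix of the joint distribution of $(X^N, Y_A^M)$.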
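
The two propositions translate directly into a few lines of linear algebra. The following sketch, with hypothetical function names, evaluates (14) and (15) in nats; for the mutual information it uses the identity $|\Sigma| = |\Sigma_{XX}| \, |\Sigma_{AA} + \sigma^2 \mathbf{I}_M|$, which follows from the independence of the noise and the attack from the state variables.

```python
import numpy as np

def kl_divergence_gaussian(sigma_yaya, sigma_yy):
    """D(N(0, sigma_yaya) || N(0, sigma_yy)) in nats, following (14)."""
    m = sigma_yy.shape[0]
    _, logdet_yy = np.linalg.slogdet(sigma_yy)
    _, logdet_ya = np.linalg.slogdet(sigma_yaya)
    trace_term = np.trace(np.linalg.solve(sigma_yy, sigma_yaya))
    return 0.5 * (logdet_yy - logdet_ya - m + trace_term)

def mutual_information_gaussian(H, sigma_xx, sigma_aa, noise_var):
    """I(X^N; Y_A^M) in nats, following (15).

    The joint determinant factorizes as |Sigma_XX| * |Sigma_AA + noise_var I|,
    so (15) reduces to a difference of two log-determinants.
    """
    m = H.shape[0]
    sigma_yaya = H @ sigma_xx @ H.T + noise_var * np.eye(m) + sigma_aa
    _, logdet_ya = np.linalg.slogdet(sigma_yaya)
    _, logdet_cond = np.linalg.slogdet(sigma_aa + noise_var * np.eye(m))
    return 0.5 * (logdet_ya - logdet_cond)
```

Using `slogdet` and `solve` instead of `det` and `inv` is a numerical design choice that keeps the computation stable for the matrix sizes encountered in larger test systems.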


Substituting (14) and (15) in (13) we can now pose the Gaussian attack construction as the
following optimization problem:

$$\min_{\Sigma_{AA} \in S_+^M} -(\lambda - 1) \log |\Sigma_{YY} + \Sigma_{AA}| - \log |\Sigma_{AA} + \sigma^2 \mathbf{I}_M| + \lambda \, \mathrm{tr}\left(\Sigma_{YY}^{-1} \Sigma_{AA}\right). \tag{16}$$

We now proceed to solve the optimization problem above. First, note that the optimization
domain $S_+^M$ is a convex set. The following proposition characterizes the convexity of the cost
function.
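Proposition 3. Let $\lambda \geq 1$. Then the cost function in the optimization problem in (16) is convex.


Proof. Note that the term $-\log |\Sigma_{AA} + \sigma^2 \mathbf{I}_M|$ is a convex function on $\Sigma_{AA} \in S_+^M$ [29].
Additionally, $-(\lambda - 1) \log |\Sigma_{YY} + \Sigma_{AA}|$ is a convex function on $\Sigma_{AA} \in S_+^M$ when $\lambda \geq 1$.
Since the trace operator is a linear operator and the sum of convex functions is convex, it
follows that the cost function in (16) is convex on $\Sigma_{AA} \in S_+^M$.


Theorem 1. Let $\lambda \geq 1$. Then the solution to the optimization problem in (16) is

$$\Sigma_{AA}^{\star} = \frac{1}{\lambda} \mathbf{H} \Sigma_{XX} \mathbf{H}^T. \tag{17}$$


Proof. Denote the cost function in (16) by $f(\Sigma_{AA})$. Taking the derivative of the cost function
with respect to $\Sigma_{AA}$ yields

$$\frac{\partial f(\Sigma_{AA})}{\partial \Sigma_{AA}} = -2(\lambda - 1)(\Sigma_{YY} + \Sigma_{AA})^{-1} - 2(\Sigma_{AA} + \sigma^2 \mathbf{I}_M)^{-1} + 2\lambda \Sigma_{YY}^{-1} + (\lambda - 1) \, \mathrm{diag}\left((\Sigma_{YY} + \Sigma_{AA})^{-1}\right) + \mathrm{diag}\left((\Sigma_{AA} + \sigma^2 \mathbf{I}_M)^{-1}\right) - \lambda \, \mathrm{diag}\left(\Sigma_{YY}^{-1}\right). \tag{18}$$

Note that the only critical point is $\Sigma_{AA}^{\star} = \frac{1}{\lambda} \mathbf{H} \Sigma_{XX} \mathbf{H}^T$. Theorem 1 follows immediately from
combining this result with Proposition 3.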


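Corollary 1. The mutual information between the vector of state variables and the vector of
compromised measurements induced by the optimal attack construction is given by

$$I(X^N; Y_A^M) = \frac{1}{2} \log \left| \mathbf{H} \Sigma_{XX} \mathbf{H}^T \left( \sigma^2 \mathbf{I}_M + \frac{1}{\lambda} \mathbf{H} \Sigma_{XX} \mathbf{H}^T \right)^{-1} + \mathbf{I}_M \right|. \tag{19}$$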
Theorem 1 shows that the generalized stealth attacks share the same structure as the stealth
attacks in [19] up to a scaling factor determined by $\lambda$. The solution in Theorem 1 holds for the
case in which $\lambda \geq 1$, and therefore, lacks full generality. However, the case in which $\lambda < 1$
yields unreasonably high probability of detection [19], which indicates that the proposed attack
construction is indeed of practical interest in a wide range of state estimation settings.

The resulting attack construction is remarkably simple to implement provided that the
information about the system is available to the attacker. Indeed, the attacker only requires access
to the linearized Jacobian measurement matrix $\mathbf{H}$ and the second order statistics of the state
variables, but the variance of the noise introduced by the sensors is not necessary. To obtain the
Jacobian, a malicious attacker needs to know the topology of the grid, the admittances of the
branches, and the operation point of the system. The second order statistics of the state variables,
on the other hand, can be estimated using historical data. In [19] it is shown that the attack
construction with a sample covariance matrix of the state variables obtained with historical data
is asymptotically optimal when the size of the training data grows to infinity.

It is interesting to note that the mutual information increases monotonically with $\lambda$ and that
it asymptotically converges to $I(X^N; Y^M)$, i.e. the case in which there is no attack. While the
evaluation of the mutual information as shown in Corollary 1 is straightforward, the computation
of the associated probability of detection yields involved expressions that do not provide much
insight. For that reason, the probability of detection of optimal attacks is treated in the following
section.
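
The trade-off governed by the weighting parameter can be explored numerically. The sketch below builds the attack covariance in (17), evaluates the mutual information with (19) and the KL divergence with (14), and sweeps a few values of the weighting parameter. The function names, the random matrix `H`, and all numerical values are assumptions chosen for illustration; they do not correspond to the IEEE test systems.

```python
import numpy as np

def stealth_attack_covariance(H, sigma_xx, lam):
    """Attack covariance in (17): (1 / lam) * H Sigma_XX H^T."""
    return (H @ sigma_xx @ H.T) / lam

def attack_metrics(H, sigma_xx, noise_var, lam):
    """Return (mutual information, KL divergence) in nats for a given lam."""
    m = H.shape[0]
    B = H @ sigma_xx @ H.T
    sigma_yy = B + noise_var * np.eye(m)
    sigma_aa = stealth_attack_covariance(H, sigma_xx, lam)
    # Corollary 1: I = 0.5 * log| B (noise_var I + Sigma_AA)^{-1} + I |
    inner = B @ np.linalg.inv(noise_var * np.eye(m) + sigma_aa) + np.eye(m)
    mi = 0.5 * np.linalg.slogdet(inner)[1]
    # (14) with Sigma_YAYA = Sigma_YY + Sigma_AA simplifies to
    # 0.5 * (tr(Sigma_YY^{-1} Sigma_AA) - log|I + Sigma_YY^{-1} Sigma_AA|).
    ratio = np.linalg.solve(sigma_yy, sigma_aa)
    kl = 0.5 * (np.trace(ratio) - np.linalg.slogdet(np.eye(m) + ratio)[1])
    return mi, kl

# Toy sweep (hypothetical system, not an IEEE test case).
rng = np.random.default_rng(1)
H = rng.standard_normal((6, 4))
sigma_xx = np.eye(4)
for lam in (1.0, 2.0, 4.0, 8.0):
    mi, kl = attack_metrics(H, sigma_xx, noise_var=0.1, lam=lam)
    print(f"lambda={lam:4.1f}  I(X;Y_A)={mi:.4f} nats  KL={kl:.4f} nats")
```

In this toy setting the printed mutual information grows and the KL divergence shrinks as the weighting parameter increases, in line with the discussion above.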


IV. PROBABILITY OF DETECTION OF GENERALIZED STEALTH ATTACKS 


The asymptotic probability of detection of the generalized stealth attacks characterized in
Section III-B is governed by the KL divergence as described in (10). However, in the
non-asymptotic case, determining the probability of detection is difficult, and therefore, choosing a
value of $\lambda$ that provides the desired probability of detection is a challenging task. In this section
we first provide a closed-form expression of the probability of detection by direct evaluation and
show that the expression does not provide any practical insight over the choice of $\lambda$ that achieves
the desired detection performance. That being the case, we then provide an upper bound on the
probability of detection, which, in turn, provides a lower bound on the value of $\lambda$ that achieves
the desired probability of detection.


A. Direct Evaluation of the Probability of Detection 


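Detection based on the LRT with threshold $\tau$ yields a probability of detection given by

$$P_D \triangleq \mathbb{E}\left[ \mathbb{1}_{\{L(Y_A^M) \geq \tau\}} \right], \tag{20}$$

where $\mathbb{1}_{\{\cdot\}}$ is the indicator function. The following lemma particularizes the above expression
to the optimal attack construction described in Section III-B.


Lemma 1. The probability of detection of the LRT in (9) for the attack construction in (17) is
given by

$$P_D(\lambda) = \mathbb{P}\left[ (U^P)^T \Delta U^P \geq \lambda \left( 2 \log \tau + \log \left| \mathbf{I}_P + \lambda^{-1} \Delta \right| \right) \right], \tag{21}$$

where $P = \mathrm{rank}(\mathbf{H} \Sigma_{XX} \mathbf{H}^T)$, $U^P \in \mathbb{R}^P$ is a vector of random variables with distribution
$\mathcal{N}(\mathbf{0}, \mathbf{I}_P)$, and $\Delta \in \mathbb{R}^{P \times P}$ is a diagonal matrix with entries given by
$(\Delta)_{i,i} = \lambda_i(\mathbf{H} \Sigma_{XX} \mathbf{H}^T) \, \lambda_i(\Sigma_{YY})^{-1}$,
where $\lambda_i(\mathbf{A})$ with $i = 1, \ldots, P$ denotes the i-th eigenvalue of matrix $\mathbf{A}$ in descending order.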
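Proof. The probability of detection of the stealth attack is

$$P_D(\lambda) = \int_{S} dP_{Y_A^M} \tag{22}$$
$$= \frac{1}{(2\pi)^{M/2} |\Sigma_{Y_A Y_A}|^{1/2}} \int_{S} \exp\left\{ -\frac{1}{2} \mathbf{y}^T \Sigma_{Y_A Y_A}^{-1} \mathbf{y} \right\} d\mathbf{y}, \tag{23}$$

where

$$S = \left\{ \mathbf{y} \in \mathbb{R}^M : L(\mathbf{y}) \geq \tau \right\}. \tag{24}$$

Algebraic manipulation yields the following equivalent description of the integration domain:

$$S = \left\{ \mathbf{y} \in \mathbb{R}^M : \mathbf{y}^T \Delta_0 \mathbf{y} \geq 2 \log \tau + \log \left| \mathbf{I}_M + \Sigma_{AA} \Sigma_{YY}^{-1} \right| \right\}, \tag{25}$$

with $\Delta_0 \triangleq \Sigma_{YY}^{-1} - \Sigma_{Y_A Y_A}^{-1}$. Let $\Sigma_{YY} = \mathbf{U}_{YY} \Lambda_{YY} \mathbf{U}_{YY}^T$ where $\Lambda_{YY} \in \mathbb{R}^{M \times M}$ is a diagonal matrix
containing the eigenvalues of $\Sigma_{YY}$ in descending order and $\mathbf{U}_{YY} \in \mathbb{R}^{M \times M}$ is a unitary matrix
whose columns are the eigenvectors of $\Sigma_{YY}$ ordered matching the order of the eigenvalues.
Applying the change of variable $\mathbf{y}_1 \triangleq \mathbf{U}_{YY}^T \mathbf{y}$ in (23) results in

$$P_D(\lambda) = \frac{1}{(2\pi)^{M/2} |\Lambda_{Y_A Y_A}|^{1/2}} \int_{S_1} \exp\left\{ -\frac{1}{2} \mathbf{y}_1^T \Lambda_{Y_A Y_A}^{-1} \mathbf{y}_1 \right\} d\mathbf{y}_1, \tag{26}$$

where $\Lambda_{Y_A Y_A} \in \mathbb{R}^{M \times M}$ denotes the diagonal matrix containing the eigenvalues of $\Sigma_{Y_A Y_A}$ in
descending order. Noticing that $\Sigma_{YY}$, $\Sigma_{AA}$ and $\Sigma_{Y_A Y_A}$ are also diagonalized by $\mathbf{U}_{YY}$, the integration
domain $S_1$ is given by

$$S_1 = \left\{ \mathbf{y}_1 \in \mathbb{R}^M : \mathbf{y}_1^T \Delta_1 \mathbf{y}_1 \geq 2 \log \tau + \log \left| \mathbf{I}_M + \Lambda_{AA} \Lambda_{YY}^{-1} \right| \right\}, \tag{27}$$

where $\Delta_1 \triangleq \Lambda_{YY}^{-1} - \Lambda_{Y_A Y_A}^{-1}$ with $\Lambda_{AA}$ denoting the diagonal matrix containing the eigenvalues of
$\Sigma_{AA}$ in descending order. Further applying the change of variable $\mathbf{y}_2 \triangleq \Lambda_{Y_A Y_A}^{-1/2} \mathbf{y}_1$ in (26) results
in

$$P_D(\lambda) = \frac{1}{(2\pi)^{M/2}} \int_{S_2} \exp\left\{ -\frac{1}{2} \mathbf{y}_2^T \mathbf{y}_2 \right\} d\mathbf{y}_2, \tag{28}$$

with the transformed integration domain given by

$$S_2 = \left\{ \mathbf{y}_2 \in \mathbb{R}^M : \mathbf{y}_2^T \Delta_2 \mathbf{y}_2 \geq 2 \log \tau + \log \left| \mathbf{I}_M + \Delta_2 \right| \right\}, \tag{29}$$

with

$$\Delta_2 \triangleq \Lambda_{AA} \Lambda_{YY}^{-1}. \tag{30}$$

Setting $\Delta \triangleq \lambda \Delta_2$ and noticing that $\mathrm{rank}(\Delta) = \mathrm{rank}(\mathbf{H} \Sigma_{XX} \mathbf{H}^T)$ concludes the proof.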


Notice that the left-hand term $(U^P)^T \Delta U^P$ in (21) is a weighted sum of independent $\chi^2$
distributed random variables with one degree of freedom where the weights are determined by
the diagonal entries of $\Delta$, which depend on the second order statistics of the state variables, the
Jacobian measurement matrix, and the variance of the noise; i.e. the attacker has no control over
this term. The right-hand side contains in addition $\lambda$ and $\tau$, and therefore, the probability of attack
detection is described as a function of the parameter $\lambda$. However, characterizing the distribution
of the resulting random variable is not practical since there is no closed-form expression for
the distribution of a positively weighted sum of independent $\chi^2$ random variables with one
degree of freedom [30]. Usually, moment matching approximation approaches such as
the Lindsay-Pilla-Basak (LPB) method [31] are utilized to solve this problem, but the resulting
expressions are complex and the relation of the probability of detection with $\lambda$ is difficult to
describe analytically following this course of action. In the following, an upper bound on the
probability of attack detection is derived. The upper bound is then used to provide a simple
lower bound on the value of $\lambda$ that achieves the desired probability of detection.
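
Although no closed form is available, the probability in (21) is easy to estimate by simulation. The sketch below is a plain Monte Carlo estimate and is not the LPB method mentioned above; the function names, the sample size, and the eigenvalue cut-off used to determine the rank are assumptions made for illustration.

```python
import numpy as np

def delta_entries(H, sigma_xx, noise_var):
    """Nonzero diagonal entries of Delta in Lemma 1.

    Sigma_YY = H Sigma_XX H^T + noise_var * I shares eigenvectors with
    H Sigma_XX H^T, so (Delta)_ii = b_i / (b_i + noise_var), where b_i are
    the eigenvalues of H Sigma_XX H^T in descending order.
    """
    b = np.linalg.eigvalsh(H @ sigma_xx @ H.T)[::-1]   # descending order
    b = b[b > 1e-10 * b[0]]                            # keep the rank-P part
    return b / (b + noise_var)

def detection_probability_mc(delta, lam, tau, n_samples=200_000, seed=0):
    """Monte Carlo estimate of P_D(lam) in (21)."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal((n_samples, delta.size))
    statistic = (u ** 2) @ delta                       # (U^P)^T Delta U^P
    threshold = lam * (2.0 * np.log(tau) + np.sum(np.log1p(delta / lam)))
    return np.mean(statistic >= threshold)

# Toy weights (assumed values) and two choices of the weighting parameter.
delta = np.array([0.9, 0.8, 0.5])
pd_lam1 = detection_probability_mc(delta, lam=1.0, tau=2.0)
pd_lam4 = detection_probability_mc(delta, lam=4.0, tau=2.0)
```

Because the statistic does not depend on the weighting parameter, the same samples can be reused for every value of it, and only the threshold changes.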


B. Upper Bound on the Probability of Detection 


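The following theorem provides a sufficient condition for $\lambda$ to achieve a desired probability
of attack detection.


Theorem 2. Let $\tau > 1$ be the decision threshold of the LRT. For any $t > 0$ and $\lambda \geq \max(\lambda^{\star}(t), 1)$
the probability of attack detection satisfies

$$P_D(\lambda) \leq e^{-t}, \tag{31}$$

where $\lambda^{\star}(t)$ is the only positive solution of $\lambda$ satisfying

$$2 \lambda \log \tau - \frac{1}{2\lambda} \mathrm{tr}(\Delta^2) - 2 \sqrt{\mathrm{tr}(\Delta^2) t} - 2 \|\Delta\|_{\infty} t = 0. \tag{32}$$


Proof. We start with the result of Lemma 1 which gives

$$P_D(\lambda) = \mathbb{P}\left[ (U^P)^T \Delta U^P \geq \lambda \left( 2 \log \tau + \log \left| \mathbf{I}_P + \lambda^{-1} \Delta \right| \right) \right]. \tag{33}$$

We now proceed to expand the term $\log |\mathbf{I}_P + \lambda^{-1} \Delta|$ using a Taylor series expansion resulting
in

$$\log \left| \mathbf{I}_P + \lambda^{-1} \Delta \right| = \sum_{i=1}^{P} \log \left( 1 + \lambda^{-1} (\Delta)_{i,i} \right) \tag{34}$$
$$= \sum_{i=1}^{P} \sum_{j=1}^{\infty} \left( \frac{\left( \lambda^{-1} (\Delta)_{i,i} \right)^{2j-1}}{2j-1} - \frac{\left( \lambda^{-1} (\Delta)_{i,i} \right)^{2j}}{2j} \right). \tag{35}$$

Since $(\Delta)_{i,i} \leq 1$ for $i = 1, \ldots, P$, and $\lambda \geq 1$, then

$$\frac{\left( \lambda^{-1} (\Delta)_{i,i} \right)^{2j-1}}{2j-1} - \frac{\left( \lambda^{-1} (\Delta)_{i,i} \right)^{2j}}{2j} \geq 0, \quad \text{for } j \in \mathbb{Z}^{+}. \tag{36}$$

Thus, (35) is lower bounded by the second order Taylor expansion, i.e.,

$$\log \left| \mathbf{I}_P + \lambda^{-1} \Delta \right| \geq \sum_{i=1}^{P} \left( \lambda^{-1} (\Delta)_{i,i} - \frac{\left( \lambda^{-1} (\Delta)_{i,i} \right)^{2}}{2} \right) \tag{37}$$
$$= \frac{1}{\lambda} \mathrm{tr}(\Delta) - \frac{1}{2\lambda^2} \mathrm{tr}(\Delta^2). \tag{38}$$

Substituting (38) in (33) yields

$$P_D(\lambda) \leq \mathbb{P}\left[ (U^P)^T \Delta U^P \geq \mathrm{tr}(\Delta) + 2 \lambda \log \tau - \frac{1}{2\lambda} \mathrm{tr}(\Delta^2) \right]. \tag{39}$$

Note that $\mathbb{E}\left[ (U^P)^T \Delta U^P \right] = \mathrm{tr}(\Delta)$, and therefore, evaluating the probability in (39) is equivalent
to evaluating the probability of $(U^P)^T \Delta U^P$ deviating $2 \lambda \log \tau - \frac{1}{2\lambda} \mathrm{tr}(\Delta^2)$ from the mean. In
view of this, the right-hand side in (39) is upper bounded by [23], [32]

$$P_D(\lambda) \leq \mathbb{P}\left[ (U^P)^T \Delta U^P \geq \mathrm{tr}(\Delta) + 2 \sqrt{\mathrm{tr}(\Delta^2) t} + 2 \|\Delta\|_{\infty} t \right] \tag{40}$$
$$\leq e^{-t}, \tag{41}$$

for $t > 0$ satisfying

$$2 \lambda \log \tau - \frac{1}{2\lambda} \mathrm{tr}(\Delta^2) \geq 2 \sqrt{\mathrm{tr}(\Delta^2) t} + 2 \|\Delta\|_{\infty} t. \tag{42}$$

The expression in (42) is satisfied with equality for two values of $\lambda$, one strictly negative and
the other strictly positive, denoted by $\lambda^{\star}(t)$, when $\tau > 1$. The result follows by noticing that
the left-hand term of (42) increases monotonically for $\lambda > 0$ and choosing $\lambda \geq \max(\lambda^{\star}(t), 1)$.
This concludes the proof.

Fig. 1. Performance of the generalized stealth attack in terms of mutual information and
probability of detection for different values of $\rho$ when $\lambda = 2$, $\tau = 2$, and SNR = 10 dB.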
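
The sufficient condition of Theorem 2 can be evaluated in closed form, because multiplying (32) by $\lambda$ turns it into a quadratic equation in $\lambda$. The following sketch computes the positive root and the resulting sufficient value of $\lambda$ for a target probability of detection. The function names are hypothetical, and $\|\Delta\|_{\infty}$ is taken to be the largest diagonal entry of $\Delta$, which is an assumption about the norm used in [23].

```python
import math

def lambda_star(delta, tau, t):
    """Positive root of (32) for diagonal entries `delta`, tau > 1, t > 0."""
    if tau <= 1 or t <= 0:
        raise ValueError("requires tau > 1 and t > 0")
    tr_sq = sum(d * d for d in delta)                    # tr(Delta^2)
    c = 2.0 * math.sqrt(tr_sq * t) + 2.0 * max(delta) * t
    a = 2.0 * math.log(tau)
    # Multiplying (32) by lam gives a * lam^2 - c * lam - tr_sq / 2 = 0.
    return (c + math.sqrt(c * c + 2.0 * a * tr_sq)) / (2.0 * a)

def sufficient_lambda(delta, tau, target_pd):
    """A value of lam for which Theorem 2 guarantees P_D(lam) <= target_pd."""
    if not 0.0 < target_pd < 1.0:
        raise ValueError("target_pd must lie in (0, 1)")
    t = -math.log(target_pd)                             # e^{-t} = target_pd
    return max(lambda_star(delta, tau, t), 1.0)

# Toy weights (assumed values).
delta = [0.9, 0.8, 0.5]
lam_5_percent = sufficient_lambda(delta, tau=2.0, target_pd=0.05)
```

Since the bound is only sufficient, smaller values of $\lambda$ may already achieve the target probability of detection in practice.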


It is interesting to note that for large values of $\lambda$ the probability of detection decreases
exponentially fast with $\lambda$. We will later show in the numerical results that the regime in which the
exponentially fast decrease sets in does not align with the saturation of the mutual information
loss induced by the attack.


V. NUMERICAL SIMULATION 


In this section, we present simulations to evaluate the performance of the proposed attack 
strategy in practical state estimation settings. In particular, the IEEE 30-Bus and 118-Bus test 
systems are utilized in the simulation. In state estimation with linearized dynamics, the Jacobian 
measurement matrix is determined by the operation point. We assume a DC state estimation 
scenario [24], [25], and thus, we set the bus voltage angles to zero. Note that in this setting 
it is sufficient to specify the network topology, the branch reactances, real power flow, and the 
power injection values to fully characterize the system. Specifically, we use the IEEE test system 
framework provided by MATPOWER [33]. 

As stated in Section IV-A, there is no closed-form expression for the distribution of a positively
weighted sum of independent $\chi^2$ random variables, which is required to calculate the probability
of detection of the generalized stealth attacks as shown in Lemma 1. For that reason, we use
the LPB method and the MOMENTCHI2 package [34] to numerically evaluate the probability
of attack detection.

Fig. 2. Performance of the generalized stealth attack in terms of mutual information and
probability of detection for different values of $\rho$ when $\lambda = 2$, $\tau = 2$, and SNR = 20 dB.

The simulation settings are the same as in [19]. The covariance matrix of the state variables is
assumed to be a Toeplitz matrix with exponential decay parameter $\rho$, where the exponential decay
parameter $\rho$ determines the correlation strength between different entries of the state variable
vector. The performance of the generalized stealth attack is a function of the weight given to the
detection term in the attack construction cost function, i.e. $\lambda$, the correlation strength between
state variables, i.e. $\rho$, and the Signal-to-Noise Ratio (SNR) of the power system which is defined
as

$$\mathrm{SNR} \triangleq 10 \log_{10} \left( \frac{\mathrm{tr}(\mathbf{H} \Sigma_{XX} \mathbf{H}^T)}{M \sigma^2} \right). \tag{43}$$
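
The simulation setting can be sketched as follows. The snippet assumes that the Toeplitz covariance has entries $\rho^{|i-j|}$, which is one common reading of "exponential decay" and is not spelled out in the text; the function names and the random matrix `H` standing in for a Jacobian are likewise hypothetical.

```python
import numpy as np

def toeplitz_covariance(n, rho):
    """Toeplitz covariance with entries rho**|i - j| (assumed form)."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def noise_variance_for_snr(H, sigma_xx, target_snr_db):
    """Solve (43) for sigma^2 given a target SNR in dB."""
    m = H.shape[0]
    signal_power = np.trace(H @ sigma_xx @ H.T)
    return signal_power / (m * 10.0 ** (target_snr_db / 10.0))

def snr_db(H, sigma_xx, noise_var):
    """SNR in dB as defined in (43)."""
    m = H.shape[0]
    return 10.0 * np.log10(np.trace(H @ sigma_xx @ H.T) / (m * noise_var))

# Toy setting: a random H in place of a MATPOWER Jacobian.
rng = np.random.default_rng(2)
H = rng.standard_normal((8, 5))
sigma_xx = toeplitz_covariance(5, rho=0.9)
noise_var = noise_variance_for_snr(H, sigma_xx, target_snr_db=10.0)
```

Fixing the SNR and solving for the noise variance keeps experiments with different values of $\rho$ comparable, since the signal power changes with the covariance of the state variables.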

Fig. 1 and Fig. 2 depict the performance of the optimal attack construction given in (17) for
different values of $\rho$ with SNR = 10 dB and SNR = 20 dB, respectively, when $\lambda = 2$ and $\tau = 2$.
Interestingly, the performance of the attack construction does not change monotonically with
the correlation strength, which suggests that the correlation among the state variables does not
necessarily provide an advantage to the attacker. Admittedly, for small and moderate values of $\rho$,
the performance of the attack does not change significantly with $\rho$ for both objectives. This effect
is more noticeable in the high SNR scenario. However, for large values of $\rho$ the performance
of the attack improves significantly in terms of both mutual information and probability of
detection. Moreover, the advantage provided by large values of $\rho$ is more significant for the
118-Bus system than for the 30-Bus system, which indicates that correlation between the state
variables is easier to exploit for the attacker in large systems.

Fig. 3. Performance of the generalized stealth attack in terms of mutual information and
probability of detection for different values of $\lambda$ and system size when $\rho = 0.1$ and $\rho = 0.9$,
SNR = 10 dB and $\tau = 2$.

Fig. 4. Performance of the generalized stealth attack in terms of mutual information and
probability of detection for different values of $\lambda$ and system size when $\rho = 0.1$ and $\rho = 0.9$,
SNR = 20 dB and $\tau = 2$.

Fig. 3 and Fig. 4 depict the performance of the optimal attack construction for different values
of $\lambda$ and $\rho$ with SNR = 10 dB and SNR = 20 dB, respectively, when $\tau = 2$. As expected,
larger values of the parameter $\lambda$ yield smaller values of the probability of attack detection while
increasing the mutual information between the state variables vector and the compromised
measurement vector. We observe that the probability of detection decreases approximately linearly
for moderate values of $\lambda$. On the other hand, Theorem 2 states that for large values of $\lambda$ the
probability of detection decreases exponentially fast to zero. However, for the range of values of
$\lambda$ in which the decrease of probability of detection is approximately linear, there is no significant
reduction on the rate of growth of mutual information. In view of this, the attacker needs to
choose the value of $\lambda$ carefully as the convergence of the mutual information to the asymptote
$I(X^N; Y^M)$ is slower than that of the probability of detection to zero.

Fig. 5. Upper bound on probability of detection given in Theorem 2 for different values of $\lambda$
when $\rho = 0.1$ or $0.9$, SNR = 10 dB, and $\tau = 2$.

The comparison between the 30-Bus and 118-Bus systems shows that for the smaller size
system the probability of detection decreases faster to zero while the rate of growth of mutual
information is smaller than that on the larger system. This suggests that the choice of $\lambda$ is
particularly critical in large size systems as smaller size systems exhibit a more robust attack
performance for different values of $\lambda$. The effect of the correlation between the state variables
is significantly more noticeable for the 118-Bus system. While there is a performance gain for
the 30-Bus system in terms of both mutual information and probability of detection due to the
high correlation between the state variables, the improvement is more noteworthy for the
118-Bus case. Remarkably, the difference in terms of mutual information between the case in which
$\rho = 0.1$ and $\rho = 0.9$ increases as $\lambda$ increases, which indicates that the cost in terms of mutual
information of reducing the probability of detection is large when the correlation is weak.

The performance of the upper bound given by Theorem 2 on the probability of detection for
different values of $\lambda$ and $\rho$ when $\tau = 2$ and SNR = 10 dB is shown in Fig. 5. Similarly, Fig. 6
depicts the upper bound with the same parameters but with SNR = 20 dB. As shown by Theorem
2 the bound decreases exponentially fast for large values of $\lambda$. Still, there is a significant gap
to the probability of attack detection evaluated numerically. This is partially due to the fact that
our bound is based on the concentration inequality in [23] which introduces a gap of more than
an order of magnitude. Interestingly, the gap decreases when the value of $\rho$ increases although
the change is not significant. More importantly, the bound is tighter for lower values of SNR
for both 30-Bus and 118-Bus systems.

Fig. 6. Upper bound on probability of detection given in Theorem 2 for different values of $\lambda$
when $\rho = 0.1$ or $0.9$, SNR = 20 dB, and $\tau = 2$.


VI. CONCLUSIONS 


We have proposed novel data injection attacks based on information-theoretic performance
measures. Specifically, we have posed the attack construction problem as an optimization problem
in which the cost function combines the mutual information and the probability of attack
detection. The proposed cost function allows the attacker to obtain an arbitrarily small probability of attack
detection via a parameter that weights the effect of the mutual information and the probability of
detection. The resulting random attack construction has been analyzed in terms of the information
loss and the probability of attack detection that it induces on the system. We have characterized
the probability of attack detection by obtaining an easy to compute upper bound. The upper
bound has been used to provide a practical attack construction guideline by determining the cost
function that achieves a given probability of attack detection.